DOI: 10.1145/2038916.2038919

Modeling and synthesizing task placement constraints in Google compute clusters

Published: 26 October 2011

ABSTRACT

Evaluating the performance of large compute clusters requires benchmarks with representative workloads. At Google, performance benchmarks are used to obtain performance metrics such as task scheduling delays and machine resource utilizations to assess changes in application codes, machine configurations, and scheduling algorithms. Existing approaches to workload characterization for high performance computing and grids focus on task resource requirements for CPU, memory, disk, I/O, network, etc. Such resource requirements address how much resource is consumed by a task. However, in addition to resource requirements, Google workloads commonly include task placement constraints that determine which machine resources are consumed by tasks. Task placement constraints arise because of task dependencies such as those related to hardware architecture and kernel version.
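To make the distinction concrete, the following minimal sketch (in Python) separates resource requirements, which are capacity checks, from placement constraints, which restrict the set of eligible machines. The task and machine encoding and the property names are illustrative assumptions for this page, not Google's actual scheduler interface.

# Illustrative sketch: resource requirements say "how much", placement
# constraints say "which machines". Encoding is assumed, not from the paper.
from dataclasses import dataclass, field

@dataclass
class Machine:
    cpu_free: float                                   # free CPU cores
    mem_free: float                                   # free memory, GB
    properties: dict = field(default_factory=dict)    # e.g. {"arch": "x86_64", "kernel": "2.6.32"}

@dataclass
class Task:
    cpu_req: float
    mem_req: float
    constraints: dict = field(default_factory=dict)   # required machine property values

def eligible(task: Task, machine: Machine) -> bool:
    # A machine is eligible only if it has enough free resources AND
    # satisfies every placement constraint of the task.
    if machine.cpu_free < task.cpu_req or machine.mem_free < task.mem_req:
        return False
    return all(machine.properties.get(k) == v for k, v in task.constraints.items())

# Example: a task pinned to a kernel version can only use the subset of
# machines exposing that property, even if other machines have free capacity.
m1 = Machine(cpu_free=4, mem_free=16, properties={"arch": "x86_64", "kernel": "2.6.32"})
t1 = Task(cpu_req=1, mem_req=2, constraints={"kernel": "2.6.32"})
print(eligible(t1, m1))  # True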

This paper develops methodologies for incorporating task placement constraints and machine properties into performance benchmarks of large compute clusters. Our studies of Google compute clusters show that constraints increase average task scheduling delays by a factor of 2 to 6, which often results in tens of minutes of additional task wait time. To understand why, we extend the concept of resource utilization to include constraints by introducing a new metric, the Utilization Multiplier (UM). UM is the ratio of the resource utilization seen by tasks with a constraint to the average utilization of the resource. UM provides a simple model of the performance impact of constraints in that task scheduling delays increase with UM. Last, we describe how to synthesize representative task constraints and machine properties, and how to incorporate this synthesis into existing performance benchmarks. Using synthetic task constraints and machine properties generated by our methodology, we accurately reproduce performance metrics for benchmarks of Google compute clusters with a discrepancy of only 13% in task scheduling delay and 5% in resource utilization.
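As a hedged illustration of the Utilization Multiplier described above, the sketch below computes UM for one constraint and one resource as the ratio of the utilization over machines satisfying the constraint to the average utilization over all machines. The machine encoding and helper names are assumptions for this page, not the paper's implementation.

# Minimal sketch of the Utilization Multiplier (UM): utilization seen by
# tasks with a constraint divided by the average utilization of the resource.
def utilization_multiplier(machines, constraint, resource="cpu"):
    # machines: list of dicts like
    #   {"used": {"cpu": 3.0}, "capacity": {"cpu": 8.0}, "properties": {"kernel": "2.6.32"}}
    # constraint: (key, value) pair a machine must expose to satisfy it.
    def util(ms):
        used = sum(m["used"][resource] for m in ms)
        cap = sum(m["capacity"][resource] for m in ms)
        return used / cap if cap else 0.0

    key, value = constraint
    satisfying = [m for m in machines if m["properties"].get(key) == value]
    avg_util = util(machines)          # average utilization of the resource
    constrained_util = util(satisfying)  # utilization seen by constrained tasks
    return constrained_util / avg_util if avg_util else float("nan")

Under this reading, a constraint with UM greater than 1 confines its tasks to a busier-than-average pool of machines, which matches the abstract's observation that task scheduling delays increase with UM.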



Published in

SOCC '11: Proceedings of the 2nd ACM Symposium on Cloud Computing
October 2011
377 pages
ISBN: 9781450309769
DOI: 10.1145/2038916

                Copyright © 2011 ACM


                Publisher

                Association for Computing Machinery

                New York, NY, United States



Acceptance Rates

Overall Acceptance Rate: 169 of 722 submissions, 23%
